Overview

Dataset statistics

Number of variables: 18
Number of observations: 23856
Missing cells: 182
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.9 MiB
Average record size in memory: 259.5 B

Variable types

NUM: 15
CAT: 2
BOOL: 1

Reproduction

Analysis started: 2020-06-15 14:26:45.827548
Analysis finished: 2020-06-15 14:27:56.042579
Duration: 1 minute and 10.22 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
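
The reported duration can be recomputed from the two timestamps in the Reproduction block. A minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2020-06-15 14:26:45.827548", FMT)
finished = datetime.strptime("2020-06-15 14:27:56.042579", FMT)

# Elapsed wall-clock time in seconds, then split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```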

Warnings

DATE has a high cardinality: 9121 distinct values High cardinality
X_3 is highly correlated with X_2 High correlation
X_2 is highly correlated with X_3 High correlation
X_10 is highly skewed (γ1 = 34.9427132) Skewed
X_12 is highly skewed (γ1 = 30.61908319) Skewed
DATE is uniformly distributed Uniform
INCIDENT_ID has unique values Unique
X_1 has 19036 (79.8%) zeros Zeros
X_4 has 3335 (14.0%) zeros Zeros
X_5 has 4695 (19.7%) zeros Zeros
X_7 has 3461 (14.5%) zeros Zeros
X_8 has 8774 (36.8%) zeros Zeros
X_11 has 2553 (10.7%) zeros Zeros
X_12 has 5171 (21.7%) zeros Zeros
X_14 has 288 (1.2%) zeros Zeros
X_15 has 1017 (4.3%) zeros Zeros
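
The percentages in the Zeros warnings are the zero count divided by the number of observations (23856). A small sketch that recomputes them from the counts listed above; the helper name `share` is hypothetical and chosen only for this illustration:

```python
# Zero counts copied from the Warnings list; N is the number of observations.
N = 23856
zero_counts = {
    "X_1": 19036, "X_4": 3335, "X_5": 4695, "X_7": 3461, "X_8": 8774,
    "X_11": 2553, "X_12": 5171, "X_14": 288, "X_15": 1017,
}

def share(count, total=N):
    """Return count / total formatted as a percentage with one decimal."""
    return f"{count / total:.1%}"

for name, count in zero_counts.items():
    print(f"{name} has {count} ({share(count)}) zeros")
```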

Variables

INCIDENT_ID
Categorical

UNIQUE

Distinct count: 23856
Unique (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 186.5 KiB
CR_92071
 
1
CR_74384
 
1
CR_180539
 
1
CR_109551
 
1
CR_101106
 
1
Other values (23851)
23851
ValueCountFrequency (%) 
CR_920711< 0.1%
 
CR_743841< 0.1%
 
CR_1805391< 0.1%
 
CR_1095511< 0.1%
 
CR_1011061< 0.1%
 
CR_457031< 0.1%
 
CR_810701< 0.1%
 
CR_473571< 0.1%
 
CR_1194871< 0.1%
 
CR_981611< 0.1%
 
CR_1811781< 0.1%
 
CR_1453521< 0.1%
 
CR_1237931< 0.1%
 
CR_1617371< 0.1%
 
CR_1189551< 0.1%
 
CR_210011< 0.1%
 
CR_192701< 0.1%
 
CR_769141< 0.1%
 
CR_1845741< 0.1%
 
CR_1134901< 0.1%
 
CR_1140671< 0.1%
 
CR_1505951< 0.1%
 
CR_1685221< 0.1%
 
CR_1558451< 0.1%
 
CR_1377711< 0.1%
 
Other values (23831)2383199.9%
 

Length

Max length: 9
Median length: 9
Mean length: 8.44714118
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
12404211.9%
 
C2385611.8%
 
R2385611.8%
 
_2385611.8%
 
4120716.0%
 
5120116.0%
 
7119845.9%
 
3119625.9%
 
6119315.9%
 
2119295.9%
 
8118615.9%
 
9114365.7%
 
0107205.3%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number12994764.5%
 
Uppercase Letter4771223.7%
 
Connector Punctuation2385611.8%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
C2385650.0%
 
R2385650.0%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_23856100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
12404218.5%
 
4120719.3%
 
5120119.2%
 
7119849.2%
 
3119629.2%
 
6119319.2%
 
2119299.2%
 
8118619.1%
 
9114368.8%
 
0107208.2%
 

Most occurring scripts

ValueCountFrequency (%) 
Common15380376.3%
 
Latin4771223.7%
 

Most frequent Latin characters

ValueCountFrequency (%) 
C2385650.0%
 
R2385650.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
12404215.6%
 
_2385615.5%
 
4120717.8%
 
5120117.8%
 
7119847.8%
 
3119627.8%
 
6119317.8%
 
2119297.8%
 
8118617.7%
 
9114367.4%
 
0107207.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII201515100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
12404211.9%
 
C2385611.8%
 
R2385611.8%
 
_2385611.8%
 
4120716.0%
 
5120116.0%
 
7119845.9%
 
3119625.9%
 
6119315.9%
 
2119295.9%
 
8118615.9%
 
9114365.7%
 
0107205.3%
 

DATE
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 9121
Unique (%): 38.2%
Missing: 0
Missing (%): 0.0%
Memory size: 186.5 KiB
12-SEP-01
 
22
13-SEP-01
 
20
17-SEP-01
 
17
15-SEP-01
 
15
11-SEP-01
 
15
Other values (9116)
23767
ValueCountFrequency (%) 
12-SEP-01220.1%
 
13-SEP-01200.1%
 
17-SEP-01170.1%
 
15-SEP-01150.1%
 
11-SEP-01150.1%
 
26-SEP-01130.1%
 
16-SEP-01120.1%
 
14-AUG-0511< 0.1%
 
19-SEP-0111< 0.1%
 
28-SEP-0111< 0.1%
 
20-SEP-0111< 0.1%
 
18-SEP-0111< 0.1%
 
02-NOV-0011< 0.1%
 
18-NOV-1610< 0.1%
 
30-JUN-0610< 0.1%
 
14-SEP-0110< 0.1%
 
31-AUG-069< 0.1%
 
18-APR-069< 0.1%
 
23-SEP-019< 0.1%
 
17-MAY-019< 0.1%
 
01-MAY-929< 0.1%
 
04-FEB-129< 0.1%
 
13-JUL-009< 0.1%
 
12-NOV-169< 0.1%
 
24-SEP-079< 0.1%
 
Other values (9096)2356598.8%
 

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Overview of Unicode Properties

Unique unicode characters: 30
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
-4771222.2%
 
0205519.6%
 
1202539.4%
 
2124935.8%
 
9115585.4%
 
A100954.7%
 
U63803.0%
 
360782.8%
 
J60092.8%
 
N57052.7%
 
E55002.6%
 
751212.4%
 
650072.3%
 
849972.3%
 
547582.2%
 
446082.1%
 
P44042.1%
 
M41321.9%
 
R41041.9%
 
O39911.9%
 
C36331.7%
 
S22901.1%
 
L21571.0%
 
Y21421.0%
 
T21381.0%
 
Other values (5)88884.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number9542444.4%
 
Uppercase Letter7156833.3%
 
Dash Punctuation4771222.2%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
02055121.5%
 
12025321.2%
 
21249313.1%
 
91155812.1%
 
360786.4%
 
751215.4%
 
650075.2%
 
849975.2%
 
547585.0%
 
446084.8%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-47712100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A1009514.1%
 
U63808.9%
 
J60098.4%
 
N57058.0%
 
E55007.7%
 
P44046.2%
 
M41325.8%
 
R41045.7%
 
O39915.6%
 
C36335.1%
 
S22903.2%
 
L21573.0%
 
Y21423.0%
 
T21383.0%
 
G21102.9%
 
V18532.6%
 
F17152.4%
 
B17152.4%
 
D14952.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Common14313666.7%
 
Latin7156833.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
-4771233.3%
 
02055114.4%
 
12025314.1%
 
2124938.7%
 
9115588.1%
 
360784.2%
 
751213.6%
 
650073.5%
 
849973.5%
 
547583.3%
 
446083.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
A1009514.1%
 
U63808.9%
 
J60098.4%
 
N57058.0%
 
E55007.7%
 
P44046.2%
 
M41325.8%
 
R41045.7%
 
O39915.6%
 
C36335.1%
 
S22903.2%
 
L21573.0%
 
Y21423.0%
 
T21383.0%
 
G21102.9%
 
V18532.6%
 
F17152.4%
 
B17152.4%
 
D14952.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII214704100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
-4771222.2%
 
0205519.6%
 
1202539.4%
 
2124935.8%
 
9115585.4%
 
A100954.7%
 
U63803.0%
 
360782.8%
 
J60092.8%
 
N57052.7%
 
E55002.6%
 
751212.4%
 
650072.3%
 
849972.3%
 
547582.2%
 
446082.1%
 
P44042.1%
 
M41321.9%
 
R41041.9%
 
O39911.9%
 
C36331.7%
 
S22901.1%
 
L21571.0%
 
Y21421.0%
 
T21381.0%
 
Other values (5)88884.1%
 

X_1
Real number (ℝ≥0)

ZEROS

Distinct count: 8
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4837776659959759
Minimum: 0
Maximum: 7
Zeros: 19036
Zeros (%): 79.8%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 3
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.439737889
Coefficient of variation (CV): 2.976032152
Kurtosis: 13.65891063
Mean: 0.483777666
Median Absolute Deviation (MAD): 0
Skewness: 3.789307148
Sum: 11541
Variance: 2.072845188
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
01903679.8%
 
1349714.7%
 
78763.7%
 
52701.1%
 
31360.6%
 
4260.1%
 
210< 0.1%
 
65< 0.1%
 
ValueCountFrequency (%) 
01903679.8%
 
1349714.7%
 
210< 0.1%
 
31360.6%
 
4260.1%
 
52701.1%
 
65< 0.1%
 
78763.7%
 
ValueCountFrequency (%) 
78763.7%
 
65< 0.1%
 
52701.1%
 
4260.1%
 
31360.6%
 
210< 0.1%
 
1349714.7%
 
01903679.8%
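
Several of the descriptive statistics for X_1 can be reproduced directly from its frequency table. The sketch below is an illustration rather than the profiler's own code; it assumes the sample variance (divisor n - 1), which is the convention the reported value is consistent with:

```python
from math import sqrt

# Frequency table of X_1 copied from the report: value -> count.
freq = {0: 19036, 1: 3497, 2: 10, 3: 136, 4: 26, 5: 270, 6: 5, 7: 876}

n = sum(freq.values())                        # number of observations
total = sum(v * c for v, c in freq.items())   # the "Sum" row
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean                               # coefficient of variation

print(n, total)
print(round(mean, 9), round(variance, 6), round(std, 6), round(cv, 5))
```

The coefficient of variation is simply the standard deviation divided by the mean, so a value near 3 reflects how strongly the few large values dominate a column that is about 80% zeros.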
 

X_2
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 52
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.791205566733737
Minimum: 0
Maximum: 52
Zeros: 22
Zeros (%): 0.1%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 7
Median: 24
Q3: 36
95-th percentile: 49
Maximum: 52
Range: 52
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 15.24023098
Coefficient of variation (CV): 0.6147434395
Kurtosis: -1.30551524
Mean: 24.79120557
Median Absolute Deviation (MAD): 13
Skewness: -0.0947521072
Sum: 591419
Variance: 232.2646403
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
4402916.9%
 
3622329.4%
 
3321749.1%
 
2413445.6%
 
2112545.3%
 
379624.0%
 
499273.9%
 
459083.8%
 
37783.3%
 
226722.8%
 
476412.7%
 
166312.6%
 
95932.5%
 
395132.2%
 
254992.1%
 
54371.8%
 
64341.8%
 
444281.8%
 
403851.6%
 
193701.6%
 
263561.5%
 
302661.1%
 
422381.0%
 
172381.0%
 
182100.9%
 
Other values (27)23379.8%
 
ValueCountFrequency (%) 
0220.1%
 
1200.1%
 
21160.5%
 
37783.3%
 
4402916.9%
 
54371.8%
 
64341.8%
 
71660.7%
 
81040.4%
 
95932.5%
 
ValueCountFrequency (%) 
52190.1%
 
511030.4%
 
501600.7%
 
499273.9%
 
48550.2%
 
476412.7%
 
461810.8%
 
459083.8%
 
444281.8%
 
43690.3%
 

X_3
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 52
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.637449698189133
Minimum: 0
Maximum: 52
Zeros: 20
Zeros (%): 0.1%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 8
Median: 24
Q3: 35
95-th percentile: 49
Maximum: 52
Range: 52
Interquartile range (IQR): 27

Descriptive statistics

Standard deviation: 15.1350925
Coefficient of variation (CV): 0.6143124669
Kurtosis: -1.237143987
Mean: 24.6374497
Median Absolute Deviation (MAD): 13
Skewness: -0.08212039854
Sum: 587751
Variance: 229.071025
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
4402916.9%
 
3422329.4%
 
3221749.1%
 
2413445.6%
 
2312545.3%
 
379624.0%
 
499273.9%
 
459083.8%
 
27783.3%
 
226722.8%
 
486412.7%
 
156312.6%
 
105932.5%
 
395132.2%
 
254992.1%
 
54371.8%
 
64341.8%
 
444281.8%
 
403851.6%
 
193701.6%
 
273561.5%
 
352661.1%
 
422381.0%
 
162381.0%
 
182100.9%
 
Other values (27)23379.8%
 
ValueCountFrequency (%) 
0200.1%
 
1220.1%
 
27783.3%
 
31160.5%
 
4402916.9%
 
54371.8%
 
64341.8%
 
71040.4%
 
81660.7%
 
92< 0.1%
 
ValueCountFrequency (%) 
52190.1%
 
511600.7%
 
501030.4%
 
499273.9%
 
486412.7%
 
47550.2%
 
461810.8%
 
459083.8%
 
444281.8%
 
43690.3%
 

X_4
Real number (ℝ≥0)

ZEROS

Distinct count: 10
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.276743796109994
Minimum: 0
Maximum: 10
Zeros: 3335
Zeros (%): 14.0%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 4
Q3: 6
95-th percentile: 10
Maximum: 10
Range: 10
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.944672067
Coefficient of variation (CV): 0.6885313238
Kurtosis: -1.013239087
Mean: 4.276743796
Median Absolute Deviation (MAD): 2
Skewness: 0.1833932631
Sum: 102026
Variance: 8.671093584
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
6549723.0%
 
2479120.1%
 
0333514.0%
 
7289012.1%
 
420278.5%
 
318717.8%
 
913605.7%
 
1012425.2%
 
18413.5%
 
52< 0.1%
 
ValueCountFrequency (%) 
0333514.0%
 
18413.5%
 
2479120.1%
 
318717.8%
 
420278.5%
 
52< 0.1%
 
6549723.0%
 
7289012.1%
 
913605.7%
 
1012425.2%
 
ValueCountFrequency (%) 
1012425.2%
 
913605.7%
 
7289012.1%
 
6549723.0%
 
52< 0.1%
 
420278.5%
 
318717.8%
 
2479120.1%
 
18413.5%
 
0333514.0%
 

X_5
Real number (ℝ≥0)

ZEROS

Distinct count: 5
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4556086519114686
Minimum: 0
Maximum: 5
Zeros: 4695
Zeros (%): 19.7%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 3
Q3: 5
95-th percentile: 5
Maximum: 5
Range: 5
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.963094729
Coefficient of variation (CV): 0.7994330562
Kurtosis: -1.558871205
Mean: 2.455608652
Median Absolute Deviation (MAD): 2
Skewness: 0.1752310231
Sum: 58581
Variance: 3.853740916
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
5736830.9%
 
1681828.6%
 
3497320.8%
 
0469519.7%
 
22< 0.1%
 
ValueCountFrequency (%) 
0469519.7%
 
1681828.6%
 
22< 0.1%
 
3497320.8%
 
5736830.9%
 
ValueCountFrequency (%) 
5736830.9%
 
3497320.8%
 
22< 0.1%
 
1681828.6%
 
0469519.7%
 

X_6
Real number (ℝ≥0)

Distinct count: 19
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.154175050301811
Minimum: 1
Maximum: 19
Zeros: 0
Zeros (%): 0.0%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 8
95-th percentile: 15
Maximum: 19
Range: 18
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 4.471756047
Coefficient of variation (CV): 0.7266215229
Kurtosis: 0.03760850344
Mean: 6.15417505
Median Absolute Deviation (MAD): 3
Skewness: 0.9608294397
Sum: 146814
Variance: 19.99660214
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1346114.5%
 
5267911.2%
 
6262911.0%
 
423199.7%
 
1523189.7%
 
222989.6%
 
722869.6%
 
317087.2%
 
814055.9%
 
912675.3%
 
166202.6%
 
122100.9%
 
112000.8%
 
181620.7%
 
131390.6%
 
171100.5%
 
10250.1%
 
14180.1%
 
192< 0.1%
 
ValueCountFrequency (%) 
1346114.5%
 
222989.6%
 
317087.2%
 
423199.7%
 
5267911.2%
 
6262911.0%
 
722869.6%
 
814055.9%
 
912675.3%
 
10250.1%
 
ValueCountFrequency (%) 
192< 0.1%
 
181620.7%
 
171100.5%
 
166202.6%
 
1523189.7%
 
14180.1%
 
131390.6%
 
122100.9%
 
112000.8%
 
10250.1%
 

X_7
Real number (ℝ≥0)

ZEROS

Distinct count: 19
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.876509054325956
Minimum: 0
Maximum: 18
Zeros: 3461
Zeros (%): 14.5%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 4
Q3: 7
95-th percentile: 12
Maximum: 18
Range: 18
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.881930665
Coefficient of variation (CV): 0.7960470538
Kurtosis: 0.493689765
Mean: 4.876509054
Median Absolute Deviation (MAD): 3
Skewness: 0.7961675929
Sum: 116334
Variance: 15.06938569
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
0346114.5%
 
6267911.2%
 
4262911.0%
 
223199.7%
 
1023189.7%
 
722989.6%
 
122869.6%
 
517087.2%
 
314055.9%
 
812675.3%
 
126202.6%
 
162100.9%
 
172000.8%
 
131620.7%
 
181390.6%
 
111100.5%
 
15250.1%
 
14180.1%
 
92< 0.1%
 
ValueCountFrequency (%) 
0346114.5%
 
122869.6%
 
223199.7%
 
314055.9%
 
4262911.0%
 
517087.2%
 
6267911.2%
 
722989.6%
 
812675.3%
 
92< 0.1%
 
ValueCountFrequency (%) 
181390.6%
 
172000.8%
 
162100.9%
 
15250.1%
 
14180.1%
 
131620.7%
 
126202.6%
 
111100.5%
 
1023189.7%
 
92< 0.1%
 

X_8
Real number (ℝ≥0)

ZEROS

Distinct count: 24
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9724597585513078
Minimum: 0
Maximum: 99
Zeros: 8774
Zeros (%): 36.8%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 99
Range: 99
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.453144468
Coefficient of variation (CV): 1.494297789
Kurtosis: 952.9615467
Mean: 0.9724597586
Median Absolute Deviation (MAD): 1
Skewness: 17.70384903
Sum: 23199
Variance: 2.111628843
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
11101046.2%
 
0877436.8%
 
222689.5%
 
39674.1%
 
44041.7%
 
52070.9%
 
6790.3%
 
7330.1%
 
8320.1%
 
10230.1%
 
9160.1%
 
1511< 0.1%
 
118< 0.1%
 
128< 0.1%
 
204< 0.1%
 
132< 0.1%
 
142< 0.1%
 
162< 0.1%
 
301< 0.1%
 
211< 0.1%
 
221< 0.1%
 
991< 0.1%
 
501< 0.1%
 
291< 0.1%
 
ValueCountFrequency (%) 
0877436.8%
 
11101046.2%
 
222689.5%
 
39674.1%
 
44041.7%
 
52070.9%
 
6790.3%
 
7330.1%
 
8320.1%
 
9160.1%
 
ValueCountFrequency (%) 
991< 0.1%
 
501< 0.1%
 
301< 0.1%
 
291< 0.1%
 
221< 0.1%
 
211< 0.1%
 
204< 0.1%
 
162< 0.1%
 
1511< 0.1%
 
142< 0.1%
 

X_9
Real number (ℝ≥0)

Distinct count: 7
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.924128101945003
Minimum: 0
Maximum: 6
Zeros: 118
Zeros (%): 0.5%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 5
Median: 5
Q3: 6
95-th percentile: 6
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.362624612
Coefficient of variation (CV): 0.276724038
Kurtosis: 1.28166232
Mean: 4.924128102
Median Absolute Deviation (MAD): 1
Skewness: -1.525286754
Sum: 117470
Variance: 1.856745834
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
51055944.3%
 
6950839.9%
 
2304012.7%
 
34521.9%
 
11750.7%
 
01180.5%
 
44< 0.1%
 
ValueCountFrequency (%) 
01180.5%
 
11750.7%
 
2304012.7%
 
34521.9%
 
44< 0.1%
 
51055944.3%
 
6950839.9%
 
ValueCountFrequency (%) 
6950839.9%
 
51055944.3%
 
44< 0.1%
 
34521.9%
 
2304012.7%
 
11750.7%
 
01180.5%
 

X_10
Real number (ℝ≥0)

SKEWED

Distinct count: 24
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.244802146210597
Minimum: 1
Maximum: 90
Zeros: 0
Zeros (%): 0.0%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 90
Range: 89
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.119300682
Coefficient of variation (CV): 0.8991795888
Kurtosis: 2190.137157
Mean: 1.244802146
Median Absolute Deviation (MAD): 0
Skewness: 34.9427132
Sum: 29696
Variance: 1.252834017
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
12019884.7%
 
2269511.3%
 
35492.3%
 
42250.9%
 
5710.3%
 
6540.2%
 
8150.1%
 
10140.1%
 
97< 0.1%
 
77< 0.1%
 
114< 0.1%
 
123< 0.1%
 
202< 0.1%
 
152< 0.1%
 
301< 0.1%
 
221< 0.1%
 
191< 0.1%
 
401< 0.1%
 
501< 0.1%
 
181< 0.1%
 
581< 0.1%
 
171< 0.1%
 
901< 0.1%
 
161< 0.1%
 
ValueCountFrequency (%) 
12019884.7%
 
2269511.3%
 
35492.3%
 
42250.9%
 
5710.3%
 
6540.2%
 
77< 0.1%
 
8150.1%
 
97< 0.1%
 
10140.1%
 
ValueCountFrequency (%) 
901< 0.1%
 
581< 0.1%
 
501< 0.1%
 
401< 0.1%
 
301< 0.1%
 
221< 0.1%
 
202< 0.1%
 
191< 0.1%
 
181< 0.1%
 
171< 0.1%
 

X_11
Real number (ℝ≥0)

ZEROS

Distinct count: 133
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 206.95451877934272
Minimum: 0
Maximum: 332
Zeros: 2553
Zeros (%): 10.7%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 174
Median: 249
Q3: 249
95-th percentile: 316
Maximum: 332
Range: 332
Interquartile range (IQR): 75

Descriptive statistics

Standard deviation: 93.03334801
Coefficient of variation (CV): 0.4495352339
Kurtosis: 0.1944049511
Mean: 206.9545188
Median Absolute Deviation (MAD): 67
Skewness: -0.9032002688
Sum: 4937107
Variance: 8655.203842
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
174727530.5%
 
249693029.0%
 
316450018.9%
 
0255310.7%
 
3034381.8%
 
1273041.3%
 
742070.9%
 
1792060.9%
 
1021220.5%
 
2631030.4%
 
218980.4%
 
328790.3%
 
290760.3%
 
313680.3%
 
43590.2%
 
200590.2%
 
128580.2%
 
325570.2%
 
21450.2%
 
277450.2%
 
71360.2%
 
231340.1%
 
299300.1%
 
208300.1%
 
330290.1%
 
Other values (108)4151.7%
 
ValueCountFrequency (%) 
0255310.7%
 
13< 0.1%
 
61< 0.1%
 
115< 0.1%
 
121< 0.1%
 
162< 0.1%
 
201< 0.1%
 
21450.2%
 
253< 0.1%
 
312< 0.1%
 
ValueCountFrequency (%) 
3323< 0.1%
 
330290.1%
 
329210.1%
 
328790.3%
 
3271< 0.1%
 
325570.2%
 
32310< 0.1%
 
3221< 0.1%
 
3217< 0.1%
 
3202< 0.1%
 

X_12
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct count: 23
Unique (%): 0.1%
Missing: 182
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.974064374419194
Minimum: 0.0
Maximum: 90.0
Zeros: 5171
Zeros (%): 21.7%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 90
Range: 90
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.167725118
Coefficient of variation (CV): 1.198817192
Kurtosis: 1880.955431
Mean: 0.9740643744
Median Absolute Deviation (MAD): 0
Skewness: 30.61908319
Sum: 23060
Variance: 1.363581951
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
11567465.7%
 
0517121.7%
 
220398.5%
 
34762.0%
 
41760.7%
 
5590.2%
 
6360.2%
 
89< 0.1%
 
107< 0.1%
 
96< 0.1%
 
114< 0.1%
 
74< 0.1%
 
152< 0.1%
 
202< 0.1%
 
581< 0.1%
 
401< 0.1%
 
161< 0.1%
 
171< 0.1%
 
901< 0.1%
 
121< 0.1%
 
301< 0.1%
 
141< 0.1%
 
501< 0.1%
 
(Missing)1820.8%
 
ValueCountFrequency (%) 
0517121.7%
 
11567465.7%
 
220398.5%
 
34762.0%
 
41760.7%
 
5590.2%
 
6360.2%
 
74< 0.1%
 
89< 0.1%
 
96< 0.1%
 
ValueCountFrequency (%) 
901< 0.1%
 
581< 0.1%
 
501< 0.1%
 
401< 0.1%
 
301< 0.1%
 
202< 0.1%
 
171< 0.1%
 
161< 0.1%
 
152< 0.1%
 
141< 0.1%
 

X_13
Real number (ℝ≥0)

Distinct count: 60
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.23738262910798
Minimum: 0
Maximum: 116
Zeros: 1
Zeros (%): < 0.1%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 18
Q1: 72
Median: 98
Q3: 103
95-th percentile: 112
Maximum: 116
Range: 116
Interquartile range (IQR): 31

Descriptive statistics

Standard deviation: 27.59722639
Coefficient of variation (CV): 0.3237690499
Kurtosis: 1.093046857
Mean: 85.23738263
Median Absolute Deviation (MAD): 11
Skewness: -1.388636749
Sum: 2033423
Variance: 761.6069043
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
103699529.3%
 
72447618.8%
 
92325513.6%
 
11221168.9%
 
9813665.7%
 
188513.6%
 
1095372.3%
 
245232.2%
 
124271.8%
 
593481.5%
 
343421.4%
 
1162881.2%
 
542341.0%
 
1132250.9%
 
1112150.9%
 
672110.9%
 
22100.9%
 
422000.8%
 
481720.7%
 
871500.6%
 
1101470.6%
 
841450.6%
 
97760.3%
 
31640.3%
 
89510.2%
 
Other values (35)2321.0%
 
ValueCountFrequency (%) 
01< 0.1%
 
15< 0.1%
 
22100.9%
 
71< 0.1%
 
82< 0.1%
 
99< 0.1%
 
10460.2%
 
124271.8%
 
131< 0.1%
 
171< 0.1%
 
ValueCountFrequency (%) 
1162881.2%
 
115210.1%
 
114160.1%
 
1132250.9%
 
11221168.9%
 
1112150.9%
 
1101470.6%
 
1095372.3%
 
1086< 0.1%
 
103699529.3%
 

X_14
Real number (ℝ≥0)

ZEROS

Distinct count: 62
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.67429577464789
Minimum: 0
Maximum: 142
Zeros: 288
Zeros (%): 1.2%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 29
Q1: 29
Median: 62
Q3: 107
95-th percentile: 142
Maximum: 142
Range: 142
Interquartile range (IQR): 78

Descriptive statistics

Standard deviation: 43.2973203
Coefficient of variation (CV): 0.5957721342
Kurtosis: -1.324908795
Mean: 72.67429577
Median Absolute Deviation (MAD): 33
Skewness: 0.2455877663
Sum: 1733718
Variance: 1874.657945
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
29816534.2%
 
93311013.0%
 
142271411.4%
 
62247410.4%
 
8014886.2%
 
13012055.1%
 
1077343.1%
 
146572.8%
 
1195792.4%
 
1035062.1%
 
874551.9%
 
1333561.5%
 
02881.2%
 
531770.7%
 
1381370.6%
 
1151300.5%
 
1241250.5%
 
61190.5%
 
25770.3%
 
140740.3%
 
136660.3%
 
77570.2%
 
24210.1%
 
76190.1%
 
57180.1%
 
Other values (37)1050.4%
 
ValueCountFrequency (%) 
02881.2%
 
21< 0.1%
 
61190.5%
 
121< 0.1%
 
146572.8%
 
162< 0.1%
 
24210.1%
 
25770.3%
 
29816534.2%
 
303< 0.1%
 
ValueCountFrequency (%) 
142271411.4%
 
140740.3%
 
13910< 0.1%
 
1381370.6%
 
136660.3%
 
1333561.5%
 
13012055.1%
 
129170.1%
 
1284< 0.1%
 
1241250.5%
 

X_15
Real number (ℝ≥0)

ZEROS

Distinct count: 28
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.46474681421864
Minimum: 0
Maximum: 50
Zeros: 1017
Zeros (%): 4.3%
Memory size: 186.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 23
Q1: 34
Median: 34
Q3: 34
95-th percentile: 46
Maximum: 50
Range: 50
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 8.38683369
Coefficient of variation (CV): 0.2506169772
Kurtosis: 8.7395923
Mean: 33.46474681
Median Absolute Deviation (MAD): 0
Skewness: -2.527453789
Sum: 798335
Variance: 70.33897934
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
341894779.4%
 
4315036.3%
 
010174.3%
 
466682.8%
 
236422.7%
 
485212.2%
 
361820.8%
 
501450.6%
 
9920.4%
 
39540.2%
 
24200.1%
 
38200.1%
 
18130.1%
 
406< 0.1%
 
416< 0.1%
 
174< 0.1%
 
44< 0.1%
 
152< 0.1%
 
321< 0.1%
 
161< 0.1%
 
311< 0.1%
 
351< 0.1%
 
51< 0.1%
 
211< 0.1%
 
81< 0.1%
 
Other values (3)3< 0.1%
 
ValueCountFrequency (%) 
010174.3%
 
44< 0.1%
 
51< 0.1%
 
81< 0.1%
 
9920.4%
 
121< 0.1%
 
141< 0.1%
 
152< 0.1%
 
161< 0.1%
 
174< 0.1%
 
ValueCountFrequency (%) 
501450.6%
 
485212.2%
 
466682.8%
 
4315036.3%
 
416< 0.1%
 
406< 0.1%
 
39540.2%
 
38200.1%
 
361820.8%
 
351< 0.1%
 

MULTIPLE_OFFENSE
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 186.5 KiB
1
22788
0
 
1068
ValueCountFrequency (%) 
12278895.5%
 
010684.5%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the angle to the x-axis does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
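
The definition above can be sketched in plain Python. This is an illustrative implementation, not the code the report uses; the function name `pearson_r` is an assumption made for the example. The 1/n factors of the covariance and the standard deviations cancel, so they are omitted:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalised covariance and variances; the common 1/n factor cancels.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
print(pearson_r(xs, [2 * x + 1 for x in xs]))     # perfectly linear, increasing
print(pearson_r(xs, [10 - 0.5 * x for x in xs]))  # perfectly linear, decreasing
```

A constant column has zero variance, so this sketch raises ZeroDivisionError for it; r is undefined in that case.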

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
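
As a sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. The helper names below are hypothetical, and tied values receive the mean of their rank positions, which is a common convention assumed here rather than something the report states:

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation: covariance over the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]       # monotonic but not linear
print(spearman_rho(xs, ys))     # ranks agree exactly
print(pearson_r(xs, ys))        # strictly below 1 because the relation is curved
```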

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
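
A minimal sketch of the pair-counting definition given here (often called tau-a). The function name is hypothetical. Library implementations such as `scipy.stats.kendalltau` default to the tau-b variant, which adjusts for ties, so results can differ from this sketch when the data contain tied values:

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Six pairs in total: five concordant, one discordant (the swap of 3 and 2).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```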

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available in the phik package documentation.

Missing values

Sample

First rows

INCIDENT_ID DATE X_1 X_2 X_3 X_4 X_5 X_6 X_7 X_8 X_9 X_10 X_11 X_12 X_13 X_14 X_15 MULTIPLE_OFFENSE
0CR_10265904-JUL-040363421561611741.09229360
1CR_18975218-JUL-17137370011171612361.0103142341
2CR_18463715-MAR-1703235102311741.011093341
3CR_13907113-FEB-090333221711612491.07229341
4CR_10933513-APR-050333221830511740.011229431
5CR_9626307-APR-0304545103101613031.07262341
6CR_13140022-JAN-080303573710511740.011229431
7CR_1198114-MAY-9308773980513161.07262341
8CR_18413421-AUG-160494965831113161.010314341
9CR_3263425-AUG-961446515100521451.010329340

Last rows

INCIDENT_ID DATE X_1 X_2 X_3 X_4 X_5 X_6 X_7 X_8 X_9 X_10 X_11 X_12 X_13 X_14 X_15 MULTIPLE_OFFENSE
23846CR_7972417-SEP-01136342115100512491.092130341
23847CR_3803302-MAY-960333221561622492.010393341
23848CR_1438402-DEC-930262790350512491.0112130341
23849CR_6895325-APR-007252590980512491.07293341
23850CR_3320111-JUL-96044651026101.07229341
23851CR_8899111-JAN-02147487315101511740.09829341
23852CR_4636905-FEB-970333221560511740.011229431
23853CR_15755603-APR-120252590351611740.01029181
23854CR_10318025-JAN-040393965271611270.0112103431
23855CR_2257508-NOV-947363421980512491.09229341